Smart Paper Notes

A novel adversarial learning strategy for medical image classification

Zong Fan , Xiaohui Zhang , Jacob A. Gasienica , Jennifer Potts , Su Ruan , Wade Thorstad , Hiram Gay , Xiaowei Wang , Hua Li

Category: Computer Vision

2022-06-23

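Deep learning (DL) techniques have been widely used for medical image classification. Most DL-based classification networks are structured hierarchically and optimized by minimizing a single loss function measured at the end of the network. However, such a single-loss design may optimize one specific value of interest while failing to exploit informative features from intermediate layers, which could benefit classification performance and reduce the risk of overfitting. Recently, auxiliary convolutional neural networks (AuxCNNs) have been employed on top of traditional classification networks to facilitate the training of intermediate layers and so improve classification performance and robustness. In this study, we propose an adversarial-learning-based AuxCNN to support the training of deep neural networks for medical image classification. Our AuxCNN classification framework adopts two main innovations. First, the proposed AuxCNN architecture includes an image generator and an image discriminator for extracting more informative image features for medical image classification, motivated by the concept of generative adversarial networks (GANs) and their impressive ability to approximate target data distributions. Second, a hybrid loss function is designed to guide model training by incorporating the different objectives of the classification network and the AuxCNN, in order to reduce overfitting. Comprehensive experimental studies demonstrate the superior classification performance of the proposed model. The effect of network-related factors on classification performance is also investigated.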
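
The abstract does not give the form of the hybrid loss, so the following is only a minimal sketch of one common way to combine a classification objective with an adversarial auxiliary objective: a weighted sum. The function names, the `aux_weight` hyperparameter, and the choice of a generator-side binary cross-entropy term are all assumptions made for illustration, not the authors' formulation.

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class under predicted probabilities."""
    return -math.log(probs[label])

def binary_cross_entropy(p, target):
    """Binary cross-entropy for a single discriminator output p in (0, 1)."""
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

def hybrid_loss(class_probs, label, disc_on_generated, aux_weight=0.5):
    """Classification loss plus a weighted adversarial (generator-side) term.

    aux_weight is a hypothetical hyperparameter, not taken from the paper.
    """
    cls_loss = cross_entropy(class_probs, label)
    # Generator-side GAN term: push the discriminator output toward "real" (1).
    adv_loss = binary_cross_entropy(disc_on_generated, 1.0)
    return cls_loss + aux_weight * adv_loss

loss = hybrid_loss([0.7, 0.2, 0.1], label=0, disc_on_generated=0.4)
print(round(loss, 4))
```

Setting `aux_weight` to zero recovers the single-loss design that the abstract argues against; larger values give the auxiliary objective more influence on training.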

In-Sensor & Neuromorphic Computing are all you need for Energy Efficient Computer Vision

Gourav Datta , Zeyu Liu , Md Abdullah-Al Kaiser , Souvik Kundu , Joe Mathai , Zihan Yin , Ajey P. Jacob , Akhilesh R. Jaiswal , Peter A. Beerel

Category: Computer Vision

2022-12-21

Due to the high activation sparsity and use of accumulates (AC) instead of expensive multiply-and-accumulates (MAC), neuromorphic spiking neural networks (SNNs) have emerged as a promising low-power alternative to traditional DNNs for several computer vision (CV) applications. However, most existing SNNs require multiple time steps for acceptable inference accuracy, hindering real-time deployment and increasing spiking activity and, consequently, energy consumption. Recent works proposed direct encoding that directly feeds the analog pixel values in the first layer of the SNN in order to significantly reduce the number of time steps. Although the overhead for the first layer MACs with direct encoding is negligible for deep SNNs and the CV processing is efficient using SNNs, the data transfer between the image sensors and the downstream processing costs significant bandwidth and may dominate the total energy. To mitigate this concern, we propose an in-sensor computing hardware-software co-design framework for SNNs targeting image recognition tasks. Our approach reduces the bandwidth between sensing and processing by 12-96x and the resulting total energy by 2.32x compared to traditional CV processing, with a 3.8% reduction in accuracy on ImageNet.
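
To illustrate why binary, sparse spikes let a layer use accumulates instead of multiply-and-accumulates, here is a small sketch. The function names and the toy weights are assumptions for illustration only; it is not the paper's in-sensor design.

```python
def mac_layer(inputs, weights):
    """Dense layer on analog inputs: one multiply-and-accumulate per weight."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def ac_layer(spikes, weights):
    """Same layer on binary spikes: a weight is accumulated only where a spike fired."""
    return [sum(w for w, s in zip(row, spikes) if s) for row in weights]

weights = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.5]]  # toy values
spikes = [1, 0, 1]  # sparse binary activations

# With 0/1 inputs, both formulations give the same pre-activations.
assert mac_layer(spikes, weights) == ac_layer(spikes, weights)

ops_used = sum(spikes) * len(weights)    # accumulates actually performed
ops_dense = len(spikes) * len(weights)   # MACs a dense layer would perform
print(ac_layer(spikes, weights), ops_used, ops_dense)
```

The operation count shrinks in proportion to the activation sparsity, which is the property the abstract credits for the low power of SNNs.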

Roboflow 100: A Rich, Multi-Domain Object Detection Benchmark

Floriana Ciaglia , Francesco Saverio Zuppichini , Paul Guerrie , Mark McQuade , Jacob Solawetz

Category: Computer Vision

2022-11-24

The evaluation of object detection models is usually performed by optimizing a single metric, e.g. mAP, on a fixed set of datasets, e.g. Microsoft COCO and Pascal VOC. Due to image retrieval and annotation costs, these datasets consist largely of images found on the web and do not represent many real-life domains that are being modelled in practice, e.g. satellite, microscopic and gaming, making it difficult to assess the degree of generalization learned by the model. We introduce Roboflow-100 (RF100), consisting of 100 datasets, 7 imagery domains, 224,714 images, and 805 class labels with over 11,170 labelling hours. We derived RF100 from over 90,000 public datasets and 60 million public images that are actively being assembled and labelled by computer vision practitioners in the open on the web application Roboflow Universe. By releasing RF100, we aim to provide a semantically diverse, multi-domain benchmark of datasets to help researchers test their model's generalizability with real-life data. RF100 download and benchmark replication are available on GitHub.

A Reinforcement Learning Approach to Optimize Available Network Bandwidth Utilization

Hasibul Jamil , Elvis Rodrigues , Jacob Goldverg , Tevfik Kosar

Category: Artificial Intelligence

2022-11-22

Efficient data transfers over high-speed, long-distance shared networks require proper utilization of available network bandwidth. Using parallel TCP streams enables an application to utilize network parallelism and can improve transfer throughput; however, finding the optimum number of parallel TCP streams is challenging due to nondeterministic background traffic sharing the same network. Additionally, the non-stationary, multi-objective, and partially observable nature of network signals in the host systems adds extra complexity in determining the current network condition. In this work, we present a novel approach to finding the optimum number of parallel TCP streams using deep reinforcement learning (RL). We devise a learning-based algorithm capable of generalizing across different network conditions and utilizing the available network bandwidth intelligently. In contrast to rule-based heuristics, which do not generalize well to unknown network scenarios, our RL-based solution can dynamically discover and adapt the number of parallel TCP streams to maximize network bandwidth utilization without congesting the network, while ensuring fairness among contending transfers. We extensively evaluated our RL-based algorithm's performance, comparing it with several state-of-the-art online optimization algorithms. The results show that our RL-based algorithm can find near-optimal solutions 40% faster while achieving up to 15% higher throughput. We also show that, unlike a greedy algorithm, our RL-based algorithm can avoid network congestion and fairly share the available network resources among contending transfers.
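
As a rough illustration of learning a stream count from observed throughput, the sketch below uses a simple epsilon-greedy bandit. This is a deliberately simpler technique than the deep RL method the abstract describes, and the throughput curve, candidate counts, and parameter values are all hypothetical.

```python
import random

def simulated_throughput(streams):
    """Hypothetical throughput curve: gains flatten, then congestion hurts."""
    return streams * 10.0 - 0.8 * streams ** 2

def choose_streams(candidates, measure, rounds=50, epsilon=0.1, seed=0):
    """Epsilon-greedy search over candidate stream counts.

    Assumes rounds >= len(candidates) so every option is measured at least once.
    """
    rng = random.Random(seed)
    totals = {c: 0.0 for c in candidates}
    counts = {c: 0 for c in candidates}

    def mean(c):
        return totals[c] / counts[c]

    for step in range(rounds):
        if step < len(candidates):
            choice = candidates[step]            # try every option once
        elif rng.random() < epsilon:
            choice = rng.choice(candidates)      # explore
        else:
            choice = max(candidates, key=mean)   # exploit the best estimate
        totals[choice] += measure(choice)
        counts[choice] += 1
    return max(candidates, key=mean)

best = choose_streams([1, 2, 4, 6, 8, 12], simulated_throughput)
print(best)
```

A bandit like this treats the network as stationary; handling the non-stationary, partially observable conditions the abstract mentions is what motivates a state-dependent RL policy instead.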

The Technological Emergence of AutoML: A Survey of Performant Software and Applications in the Context of Industry

Alexander Scriven , David Jacob Kedziora , Katarzyna Musial , Bogdan Gabrys

Category: Machine Learning | Artificial Intelligence

2022-11-08

With most technical fields, there exists a delay between fundamental academic research and practical industrial uptake. Whilst some sciences have robust and well-established processes for commercialisation, such as the pharmaceutical practice of regimented drug trials, other fields face transitory periods in which fundamental academic advancements diffuse gradually into the space of commerce and industry. For the still relatively young field of Automated/Autonomous Machine Learning (AutoML/AutonoML), that transitory period is under way, spurred on by a burgeoning interest from broader society. Yet, to date, little research has been undertaken to assess the current state of this dissemination and its uptake. Thus, this review makes two primary contributions to knowledge around this topic. Firstly, it provides the most up-to-date and comprehensive survey of existing AutoML tools, both open-source and commercial. Secondly, it motivates and outlines a framework for assessing whether an AutoML solution designed for real-world application is 'performant'; this framework extends beyond the limitations of typical academic criteria, considering a variety of stakeholder needs and the human-computer interactions required to service them. Thus, additionally supported by an extensive assessment and comparison of academic and commercial case-studies, this review evaluates mainstream engagement with AutoML in the early 2020s, identifying obstacles and opportunities for accelerating future uptake.

CompNet: A Designated Model to Handle Combinations of Images and Designed features

Bowen Qiu , Daniela Raicu , Jacob Furst , Roselyne Tchoua

Category: Computer Vision | Artificial Intelligence

2022-09-28

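Convolutional neural networks (CNNs) are among the most popular artificial neural network (ANN) models in computer vision (CV). Researchers have developed a variety of CNN-based structures to solve problems such as image classification, object detection, and image similarity measurement. Although CNNs have shown their value in most cases, they still have a drawback: they overfit easily when the dataset does not contain enough samples. Most medical image datasets are examples of such datasets. In addition, many datasets contain both designed features and images, but CNNs can only process images directly. This is a missed opportunity to leverage the additional information. We therefore propose a new structure for CNN-based models: CompNet, a composite convolutional neural network. It is a specially designed neural network that accepts combinations of images and designed features as input in order to leverage all available information. The novelty of this structure is that it uses features learned from the images to weight the designed features, so that information is drawn from both the images and the designed features. Applied to classification tasks, the structure shows that our approach can significantly reduce overfitting. We also found several similar approaches proposed by other researchers that combine images and designed features. For comparison, we first applied those approaches to LIDC and compared their results with CompNet's; we then applied CompNet to the datasets those approaches originally used and compared our results with the results reported in their papers. All of these comparisons show that our model outperforms the similar approaches on classification tasks, both on the LIDC dataset and on their own proposed datasets.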
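
The abstract says only that features learned from images are used to weight the designed features. One possible reading, sketched below, is an element-wise gate computed from the image features. The sigmoid gate, the `projection` matrix, and all numeric values are assumptions for illustration and may differ from the actual CompNet architecture.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def weight_designed_features(image_features, designed_features, projection):
    """Gate each designed feature with a weight computed from image features.

    projection[i] is a hypothetical learned vector that maps the image
    features to the gate of designed feature i.
    """
    gates = [sigmoid(sum(p * f for p, f in zip(row, image_features)))
             for row in projection]
    return [g * d for g, d in zip(gates, designed_features)]

image_features = [0.2, -0.4]        # stand-in for a CNN's output
designed_features = [3.0, 5.0]      # stand-in for hand-crafted measurements
projection = [[0.0, 0.0], [10.0, 0.0]]

weighted = weight_designed_features(image_features, designed_features, projection)
print(weighted)
```

In a trained model the projection would be learned jointly with the CNN, so the image content decides how much each designed feature contributes to the classifier.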

Airway measurement by refinement of synthetic images improves mortality prediction in idiopathic pulmonary fibrosis

Ashkan Pakzad , Mou-Cheng Xu , Wing Keung Cheung , Marie Vermant , Tinne Goos , Laurens J De Sadeleer , Stijn E Verleden , Wim A Wuyts , John R Hurst , Joseph Jacob

Category: Computer Vision

2022-08-30

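Several chronic lung diseases, such as idiopathic pulmonary fibrosis (IPF), are characterised by abnormal dilatation of the airways. Quantification of airway features on computed tomography (CT) can help characterise disease progression. Physics-based airway measurement algorithms have been developed, but they have met with limited success because of the diversity of airway morphology seen in clinical practice. Supervised learning methods are also not feasible, owing to the high cost of obtaining precise airway annotations. We propose synthesising airways by style transfer using perceptual losses to train our model, the Airway Transfer Network (ATN). We compare our ATN model with a state-of-the-art GAN-based network (SimGAN) using a) qualitative assessment and b) assessment of the ability of ATN-based and SimGAN-based CT airway metrics to predict mortality in 113 patients with IPF. ATN was shown to be quicker and easier to train than SimGAN. ATN-based airway measurements were also found to be consistently stronger predictors of mortality than SimGAN-derived airway metrics on IPF CTs. A transformation network that refines synthetic data using perceptual losses is a realistic alternative to GAN-based methods for clinical CT analysis of idiopathic pulmonary fibrosis. Our source code can be found at https://github.com/ashkanpakzad/atn and is compatible with the existing open-source airway analysis framework, AirQuant.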

A Letter on Progress Made on Husky Carbon: A Legged-Aerial, Multi-modal Platform

Adarsh Salagame , Shoghair Manjikian , Chenghao Wang , Kaushik Venkatesh Krishnamurthy , Shreyansh Pitroda , Bibek Gupta , Tobias Jacob , Benjamin Mottis , Eric Sihite , Milad Ramezani

Category: Robotics

2022-07-25

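Animals such as birds make wide use of multi-modal locomotion, combining legged and aerial mobility with dominant inertial effects. Robotic biomimicry of this multi-modal locomotion feat can yield ultra-flexible systems in terms of their ability to negotiate their task spaces. The main objective of this paper is to discuss the challenges of achieving multi-modal locomotion and to report our progress in developing a quadrupedal robot capable of multi-modal (legged and aerial) locomotion. We report the mechanical and electrical components used in the robot, as well as the simulations and experiments carried out toward our goal of developing a versatile multi-modal robotic platform.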

Language Model Cascades

David Dohan , Winnie Xu , Aitor Lewkowycz , Jacob Austin , David Bieber , Raphael Gontijo Lopes , Yuhuai Wu , Henryk Michalewski , Rif A. Saurous , Jascha Sohl-dickstein

Category: Natural Language Processing | Artificial Intelligence

2022-07-21

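Prompted models have demonstrated impressive few-shot learning abilities. Repeated interactions at test time with a single model, or the composition of multiple models together, further expand their capabilities. These compositions are probabilistic models and may be expressed in the language of graphical models with random variables whose values are complex data types such as strings. Cases with control flow and dynamic structure require techniques from probabilistic programming, which allow disparate model structures and inference strategies to be implemented in a unified language. We formalize several existing techniques from this perspective, including scratchpads / chain of thought, verifiers, STaR, selection-inference, and tool use. We refer to the resulting programs as language model cascades.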
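
To make the graphical-model view concrete, the sketch below treats chain of thought as a latent string variable between question and answer and marginalizes it out. The lookup tables are hypothetical stand-ins for language model calls, and the code does not use the authors' library or API.

```python
# Hypothetical conditional tables standing in for language model calls.
P_THOUGHT = {  # p(thought | question)
    "2+2*3": {"multiply first": 0.8, "add first": 0.2},
}
P_ANSWER = {  # p(answer | question, thought)
    ("2+2*3", "multiply first"): {"8": 0.9, "12": 0.1},
    ("2+2*3", "add first"): {"8": 0.1, "12": 0.9},
}

def answer_distribution(question):
    """Marginalize the latent thought: p(a|q) = sum_t p(t|q) * p(a|q,t)."""
    result = {}
    for thought, p_t in P_THOUGHT[question].items():
        for answer, p_a in P_ANSWER[(question, thought)].items():
            result[answer] = result.get(answer, 0.0) + p_t * p_a
    return result

dist = answer_distribution("2+2*3")
print(dist)
```

With real language models the thought space is far too large to enumerate, so the sum would typically be approximated by sampling thoughts rather than computed exactly as it is here.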

A Massively-Parallel 3D Simulator for Soft and Hybrid Robots

Joel Clay , Sofia Wyetzner , Alex Gaudio , Boxi Xia , Andrew Moshova , Jacob Austin , Max Segan , Hod Lipson

Category: Robotics

2022-07-19

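Simulation is an important step in robotics for creating control policies and testing various physical parameters. Soft robotics is a field that presents unique physical challenges for simulation, owing to the nonlinearity of deformable material components along with other innovative, and often complex, physical properties. Because of the computational cost of simulating soft and heterogeneous objects with traditional techniques, rigid-robot simulators are poorly suited to simulating soft robots. As a result, many engineers must either build one-off simulators tailored to their own systems or use existing simulators at reduced performance. To facilitate the development of this exciting technology, this work presents an interactive-speed, accurate, and versatile simulator for a wide variety of soft robots. Cronos, our open-source 3D simulation engine, parallelizes a mass-spring model for ultra-fast performance on both deformable and rigid objects. Our approach is applicable to a wide range of nonlinear material configurations, including high deformability, volumetric actuation, and heterogeneous stiffness. This versatility makes it possible to mix materials and geometric components freely within a single robot simulation. By exploiting the flexibility and scalability of nonlinear Hookean mass-spring systems, the framework simulates soft and rigid objects with a highly parallel model to reach near-real-time speed. We describe an efficient GPU CUDA implementation, which we demonstrate to compute over one billion elements per second on consumer-grade GPU cards. The dynamic physical accuracy of the system is validated by comparing results against Euler-Bernoulli beam theory, natural frequency predictions, and the behavior of a soft structure under large deformation.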
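
The core of a Hookean mass-spring model can be sketched in a few lines. The version below is a serial, pure-Python illustration with a linear spring and a semi-implicit Euler integrator. The integrator choice, function names, and parameter values are assumptions, and this is not the Cronos CUDA implementation.

```python
import math

def spring_force(pos_a, pos_b, stiffness, rest_length):
    """Hooke's law force on mass a from a spring connecting a and b.

    Assumes the two masses are not at the same position.
    """
    delta = [b - a for a, b in zip(pos_a, pos_b)]
    length = math.sqrt(sum(d * d for d in delta))
    magnitude = stiffness * (length - rest_length)
    return [magnitude * d / length for d in delta]

def step(positions, velocities, springs, mass, dt):
    """One semi-implicit Euler step; springs are (i, j, stiffness, rest_length)."""
    forces = [[0.0, 0.0, 0.0] for _ in positions]
    for i, j, k, rest in springs:
        f = spring_force(positions[i], positions[j], k, rest)
        for axis in range(3):
            forces[i][axis] += f[axis]   # pulls i toward j when stretched
            forces[j][axis] -= f[axis]   # equal and opposite reaction
    for n in range(len(positions)):
        for axis in range(3):
            velocities[n][axis] += dt * forces[n][axis] / mass
            positions[n][axis] += dt * velocities[n][axis]

positions = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]   # stretched past rest length 1.0
velocities = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
step(positions, velocities, [(0, 1, 10.0, 1.0)], mass=1.0, dt=0.01)
print(positions)
```

Each spring's force depends only on its two endpoint masses, which is what makes this kind of model a natural fit for the per-element GPU parallelism the abstract describes.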
